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Introduction 

When dealing with imperfect data and general models of dynamic systems, the best es- 
timate is always sought in the presence of uncertainty or unknown parameters. In many 
cases, as the first attempt, the Extended Kalman filter (EKF) provides sufficient solutions 
to handling issues arising from nonlinear and non-Gaussian estimation problems. But these 
issues may lead unacceptable performance and even divergence. In order to accurately cap- 
ture the nonlinearities of most real-world dynamic systems, advanced filtering methods have 
been created to reduce filter divergence while enhancing performance. Approaches, such 
as Gaussian sum filtering, grid based Bayesian methods and particle filters are well-known 
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examples of advanced methods used to represent and recursively reproduce an approxima- 
tion to the state probability density function (pdf). Some of these filtering methods were 
conceptually developed years before their widespread uses were realized. Advanced nonlin- 
ear filtering methods currently benefit from the computing advancements in computational 
speeds, memory, and parallel processing. 

Grid based methods, multiple-model approaches and Gaussian sum filtering are numerical 
solutions that take advantage of different state coordinates or multiple-model methods that 
reduced the amount of approximations used. Choosing an efficient grid is very difficult for 
multi-dimensional state spaces, and oftentimes expensive computations must be done at each 
point. For the original Gaussian sum filter, a weighted sum of Gaussian density functions 
approximates the pdf but suffers at the update step for the individual component weight 
selections [1] . In order to improve upon the original Gaussian sum filter, Ref. [2] introduces 
a weight update approach at the filter propagation stage instead of the measurement update 
stage. This weight update is performed by minimizing the integral square difference between 
the true forecast pdf and its Gaussian sum approximation [2], By adaptively updating each 
component weight during the nonlinear propagation stage an approximation of the true pdf 
can be successfully reconstructed. 

Particle filtering (PF) methods have gained popularity recently for solving nonlinear 
estimation problems due to their straightforward approach and the processing capabilities 
mentioned above [3]. The basic concept behind PF is to represent any pdf as a set of 
random samples. As the number of samples increases, they will theoretically converge to 
the exact, equivalent representation of the desired pdf. When the estimated q th moment 
is needed, the samples are used for its construction allowing further analysis of the pdf 
characteristics [4], However, filter performance deteriorates as the dimension of the state 
vector increases. To overcome this problem Ref. [5] applies a marginalization technique for 
PF methods, decreasing complexity of the system to one linear and another nonlinear state 
estimation problem. 

The marginalization theory was originally developed by Rao and Blackwell independently. 
According to Ref. [6] it improves any given estimator under every convex loss function. The 
improvement comes from calculating a conditional expected value, often involving integrat- 
ing out a supportive statistic. In other words, Rao-Blackwcllization allows for smaller but 
separate computations to be carried out while reaching the main objective of the estimator. 
In the case of improving an estimator’s variance, any supporting statistic can be removed 
and its variance determined. Next, any other information that dependents on the supporting 
statistic is found along with its respective variance [6]. 

A new approach is developed here by utilizing the strengths of the adaptive Gaussian 
sum propagation in Ref. [2] and a marginalization approach used for PF methods found in 
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Ref. [7]. In the following sections a modified filtering approach is presented based on a special 
state-space model within nonlinear systems to reduce the dimensionality of the optimization 
problem in Ref. [2]. First, the adaptive Gaussian sum propagation is explained and then 
the new marginalized adaptive Gaussian sum propagation is derived. Finally, an example 
simulation is presented. 


Gaussian Sum Nonlinear Model Propagation 


In this section, the approach from Ref. [2] is used to approximate the pdf for nonlinear 
systems. The idea is to approximate the required posterior pdf by a Gaussian mixture , which 
is a weighted sum of Gaussian density functions [8]. According to Refs. [2,9], with a sufficient 
collection of Gaussian components, any pdf may be approximated as closely as desired. For 
this section the notation found in Ref. [2] is followed with minor differences in symbols, in 
order to present their new method. Given a pdf p(xfc), the Gaussian approximation is 

pM ~J2 wi k N{*kK,pi) (i) 

i = 1 


where the i th mean and covariance are denoted by x^, and P k , respectively. In order to have a 
valid pdf, the following requirement must be satisfied: that for some q and positive weights, 
Wi, Yli=i w k = 1- Each Gaussian distribution, A/"(x[.|x^,, PJf) is given by 


jv ( 4 i nxo = 


Mir* 


exp 


~2 ( x “ x fc )( P fc ) ( x ~ x fc ) 


( 2 ) 


The pdf at time k+1 to be approximated as a Gaussian mixture is given by the Chapman- 
Kolmogorov equation [8]: 


p(xfe+i) = / p(x fc+ i|x fc )p(x fc ) dx fc 


( 3 ) 


where p(xfc + i|xfc) is the probabilistic model of the state evolution (also known as state tran- 
sition pdf). Consider the nonlinear model of the general form: 


Xfc-fl = f(x fc ,U fc ,fc) +T] k 


( 4 ) 


The state transition pdf depends on the pdf for the process noise variable usually modeled 
as an additive zero-mean Gaussian noise processes with covariance Q k [2], Thus, the state 
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transition pdf can be written as 


p(x fc+ i|x fc ) = A/'(x fc+ i|f(x fc , Ufc, k), Qk ) 


( 5 ) 


In the traditional Gaussian sum approximation, the forecast density function, p(x*, +1 ), is 
obtained by linearizing the nonlinear transformation and assuming that the weights of the 
different components are constant, i.e. , 


P(x fc +i) = ^Wfc +1 AT(x fc+ i| X fc+1> 1) 


( 6 ) 


2=1 


where 


Wfc+l = w, 


4+1 = f(4+i. u i+i,*: + 1) 
= F*(x‘)^Ff(x‘) + g* 


(7a) 

(7b) 

(7c) 


and 


F,fx l ) = 


d f ( x fc> u fc, fc ) 

9x1 


( 8 ) 


One of the advantages of the Gaussian sum approximation is, in a low-noise environ- 
ment, the resulting estimator can be very nearly optimal. Conversely, the problem is, how to 
formulate an algorithmic procedure for on-line computation of updating the weights. This 
computation is difficult due to the fact that the number of components q can grow ex- 
ponentially with time [2,8]. Therefore, to obtain a favorable posterior pdf approximation 
developing better update laws for the future weights is necessary for enhancing performance. 


Weight Update for Discrete-Time Dynamic Systems 

While the traditional Gaussian sum approximation keeps the weights constant, Ref. [2] 
introduces a weight update scheme that utilizes the Chapman-Kolmogorov equation. That 
is, given the Gaussian sum approximation at some time k as 

PM = w MM^l p l) ( 9 ) 

i= 1 

the Gaussian approximation as time k + 1 may be written as 

<? 

P(Xfc+l) = ^2 W k+l^f (Xfc+i |Xfc +1 , P k+l) ( 10 ) 

2=1 
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The approach in Ref. [2] finds the new weights of the Gaussian mixture such that the dif- 
ference between the true pdf in Eq. (3) and the approximated pdf in Eq. (10) is minimized 
over the domain, x*; +1 , i.e. , 


min J = ~ / ||p(x fc+1 ) -p(x fc+ i)|| 2 dx fe+ i 

(Ha) 

W k+1 ^ J 


Q 

s.t. y w \+i = 1 
2=1 

(lib) 

++i >0, i = 1, . . . , q 

(11c) 


The terms in the above cost function can be expanded and rewritten as 

J = 1 / |p 2 (x* + i) - 2 p(x k+1 )p(x k+i ) + p 2 (x fe+1 ) dx i+1 


( 12 ) 


Substituting Eq. (10) and Eq. (3) gives 


\ j P 2 (x fc +i) dx fc+1 = ^ 


k+ 1) Pk+ 1) 

J L i = 1 


dx, 


fc+i 


(13) 


and 

2p(x fc+ i)p(x fc+ i) = 2 


p(x fc+ i|x fc )p(x fc ) dx fc 


y:<+i^(x fc +iix 

fc+1: Pf 


fc+1 ) 


J L 2=1 


2=1 


fc+1 


p(x fc+1 |x fc )7V(x fc+1 |x‘ fc+1 , p l k+l )(ht k+l 


dxfc+i 

(14a) 

p(x fc )dx fc (14b) 


which are quadratic and linear terms, respectively. Each linear term from the summation is 
denoted as y j. Substituting Eq. (5) yields 


Vi = 


A+x fc+ i|f(x fc ,u fc , k),Q k )J\f(^k+i\^k + i,P l k+ i) dxfc + i 


p(xfc) dx fc 


A/'(f(x fc , u fc , fc)|x fe+ i, P^. +1 + Q fc ) 


p(xfc) dx fc 


(15a) 

(15b) 


The first term in the p 2 (xfc + i) expression is an additive constant and excluded from the 
optimization problem. Now the cost function can be written as 


J = ^ Wfc+i Afw fc+1 - w l +l y 


(16) 
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where w fc+1 = K +1 w 2 k+1 

a symmetric matrix given by 


w k+i\ I i y contains the components i/i, and M G is 


m — m(xi 1+1 )m T (x t+1 )dx l+I 


(17) 


where VJt is a g x 1 vector that contains all the Gaussian components at time k + 1: 


mt= [jv(x fc+1 | x fc+1 ,p fc+1 ) ^/■(x fc |x fc+1 ,p fc+1 ) 


A/"(x fc+1 |x^ X )] T (18) 


As a result the components of M are given by the product rule of two Gaussian density 
functions which yield another Gaussian density function [10]. Integrating the product leaves 
only the normalization constants performed. The off-diagonal elements of M are given by 


mij= / A/" (x fc+ i |x* fc+1 , P l k+1 )N' (xfc+i | 


x% +1 , P J k+1 ) dx k+1 ,i^j 


(19a) 


= AT(x 




k+i | x fc+n -ffe+i + Pi 


U) 


X / AT(x* +1 |P^ +1 [(P* +1 ) X l fe+1 + (P^ +1 ) X^ +1 ],P* J +1 ) dx fe+1 


— AT(x fc+1 |x J fc+1 , Pfc +1 + Pfe +1 ) 
= ||2vrP^ +1 ||-texp --(x 


k + 1 


j ^ T (P ; S 


fc+ii 




‘■fe+l 


(19b) 

(19c) 


where P^ +1 = P^ +1 + P^ +1 and P* J = [(P^ +1 ) 1 + (P^ +1 ) *] x . The diagonal elements of M 
are given by 


m. 


=v(x , t+1 |xi +1 ,py 1 + f>y 1 ) 




i || —1/2 


( 20 ) 


Returning to the linear term w k+1 y , if the prior pdf is approximated by a Gaussian sum 
then 


Vi = 


•A/’(f( x fe, u fc , k)\x. l k+1 , Pl +1 + Q k ) 
AA(f(x fc , u k , k) |x* fc+1 , Pfc +1 + Q k ) 


p(x fc ) dx fc 


p(xfc) dx fc 


(21a) 

(21b) 


Therefore, 


J/i = ^2 w k / >A/"(f (x*:, Ufe, fc)|x^. + i, Pfe + i + Qfc)A/"( x fc|x^, Pfc) dx fc 


2=1 


(22a) 
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(22b) 


= E 

i— 1 

The vector w fe = • • • u’][] T is the prior weight vector and the matrix N E M 9 * 9 

contains the following components: 


Nij = / J\f(f(x k ,u k ,k)\xl +1 ,Pl +1 + Q k )J\f(x k \x.{,Pl) dx fc 


= E 


'•^(Xfcl x J fc ,-Pfe) 


A(f(x fc ,u fe ,fc)|x^ + 1 ,P ^ +1 + Q fe ) 


(23a) 

(23b) 


Note that an Unscented Transformation 3, (UT) can be used to compute the expectations of 
Eq. (23b). The final optimization problem is presented in the quadratic programming form 
as 


min J(w' k+1 ) = ^wl +1 Mw k+1 - w^ +1 Nw k 

S-t- IqXlWi+i = 1 
w * +1 > Og X i, i = l,...,q 

where l qx i is a vector of ones and 0 gx i is a vector of zeros. 


(24a) 

(24b) 

(24c) 


Rao-Blackwellization Approach 

A Gaussian sum filter approach allows an initial reduction of a computation complexity 
versus the classical Sequential Monte Carlo (SMC) method using the sequential importance 
sampling (SIS) algorithm. This is because a SMC method will approximate a continuous 
distribution of interest by a finite (but large) number of weighted random samples or par- 
ticles in the state space. In theory any SIS method can approximate the posterior pdf of 
any form and solve any nonlinear system with any arbitrary distribution, i.e. the PF. Also, 
incorporating the familiar Bayesian approach known as Rao-Blackwellization, the SMC com- 
plexity can also be reduced by marginalizing out the conditional linear parts of the nonlinear 
model [6]. This results in a Rao-Blackwellized Particle Filter (RBPF) where the linear por- 
tion is estimated using a Kalman Fliter (KF), and the nonlinear portion using the original 
PF as mentioned earlier [7,8,11,12], In a similar goal of computational effort reduction for 
the RBPF, a Rao-Blackwellized Adaptive Gaussian Sum (RB-AGS) approach can allow a 
reduction of the number of Gaussian components by propagating linear portions using the 
KF and the nonlinear parts by the original EKF or Unscented Kalman Filter. Therefore, the 

a See Ref. [2] for caveats concerning the UT. 
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Rao-Blackwellization and the new update laws will prove useful for some unique nonlinear 
systems. 

Assume that the nonlinear system from Eq. (4) can be decomposed into a general state 
space model: 


x L+i = f fc( x fe, u fc) + vi 

X fc+ 1 = f fc( x fc,U fc ) +»$ 


where the process noise r]k is 


Ik = 


~ A/”(0, Q k ), 

Qk 

i 

O 

Q> 


Ik 



G> 

o 


(25a) 

(25b) 


(26) 


Here, obtaining the pdf p (x*,) = p (x(., xjj) is desired. The pdf p (x*,) is calculated using 
the following two steps. First it is assumed, at time k, the Gaussian sum approximation of 
p (x£) is given by 


p(xj*) = £>i (27) 

i=l 

for some chosen mean and covariance Now the forecast approximation, p (x^ +1 ), is 
calculated using the Chapman-Kolmogorov equation: 

P ( x fc+i) = j P ( x fc+il x fc) P K) rfx fc (28) 

Note that the precise expression for the conditional pdf, p (x£ +1 |x£), can be written as 


p {- > 


-fc+i I 


))=V(x2 +1 |f t ”(x;,u t ),«) 


(29) 


Following the same approach previously shown, the weights corresponding to individual 
Gaussian components can be obtained from solving the following optimization problem: 


min J — - / ||p(x” +1 ) -p(x ^ +1 )|| 2 dx £ +1 

(30a) 

w k+i z J 




s.t. y w \+i = 1 

i = 1 

(30b) 

s 

+ “■ 

IV 

o , 

(30c) 
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The above optimization problem can be written in vector format as 


min J(w* k+1 ) = „w[ +l Mw( c+l - w£ +1 AGv fc 
w \+ 1 2 

s-t. iJ’xiwJ+i = 1 

W- +1 > Ogxi, i = 1, . . . , q 


(31a) 

(31b) 

(31c) 


where w fc+ i = [w\ +1 w^+i ■ ■ ■ w q k+1 ] T , and w k = [w\ w k ■■■ w 9 k ] T is the prior 
weight vector. As before the symmetric Gaussian components for time k + 1 are 


M = / 9K(x£ +1 )9Jr (x£ +1 ) dx. n k+1 


and 911 is a q x 1 vector that contains all the Gaussian components at time k + 1: 


911— [Af (x£ +1 1 n k+1 , E fc+1 ) Af (x£|/z fe+1 , £ fc+1 


Once again, for M the off-diagonal terms are 


rriij = ||27tE^ + 1 || 2 e xp 


--(/4+ 1-Mfc+i) ( s fc+i) (Mfc+1 — Mfc+i) 


where £* +1 = £’ fc+1 + £fc +1 and the diagonal terms reduce to 


~ ' ' ^(Mfc+llMfc+l) ^fc+l + ^fc+l) 

= ||4irEi +1 ir 1/2 

Next the linear term w k+1 y is made up of the individual components given by 
K = £ < f JVI/rtxJ.u*, fe)|M t+1 . E‘ +1 + Q2) V«|,4, S{) <ix£ 

i = 1 •' 

= £ 

Z=1 

Now with w^ +1 y = w k+1 Nw k , each component can be seen as 

Nij = [ u fc , fc)|M fe+1 , 4 + i + <3fc) JVKIK, Ej) dxJJ 


= E 




AA(/nx^,u fe ,A;)| Mfc+1 ,4 +1 + g^ 


(32) 


AT( IK+i.Ey i )] T (33) 


(34) 


(35) 


(36a) 

(36b) 


(37a) 

(37b) 


where the expectation above can be computed by the UT. Note that only the Gaussian 
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sum approximation is used for the marginalized nonlinear portion of special case state-space 
model. 

The joint pdf p (xj,, xjj) can be calculated as 

P (xi + i, x^ +1 ) = I p (x[. +1 , x£ +1 |x£) p (x2) dx£ (38) 

Note that since the conditional pdf p (x^. +1 ,x£ +1 |x£) is Gaussian and an analytical expres- 
sion, it can be easily obtained as 




-fc+i’ ^fc+i I 


i)=m 


fiW.up 





(39) 


Substituting the Gaussian sum approximation for p (x£), an approximation for the joint pdf, 
p (xfc), can be written as 


P ( x i+i. x fc + 1 


q r 

) = £</ 

»= i 17 


AT 


ffcixi:, u 


n n n\ 
/c 5 ; 


pn/ n n 
l /c V X /c> u /c 


QL 0 

o Qk 


^(x)f|/ii,Ei)dx2 (40) 


The above integral can be considered as an expectation since 


•A/"(ffc( x fc, u fe), Qk)N (xfclAtfe, Sfc) dxj! = |^ )S i) 


AA(f fc (x^,<),Q fe ) 


(41) 


where 

fh Y n n n 

f n(„n ..n 
1 fc 5 u fc 

Thus, the marginalized Gaussian sum filter decreases the dimensionality of the opti- 
mization problem in Ref. [2] but involves an extra expectation type integral in Eq. (41) 
involving Gaussian pdfs, which can be evaluated using sigma points from the UT. How- 
ever, the marginalized Gaussian sum approximation may yield more accurate results when 
compared to the full-state Gaussian sum approximation using the same number of Gaussian 
components for the nonlinear portion of the state-space model. b The increase in accuracy is 
due to the fact that, in the marginalized approach, the same number of Gaussian components 
are used to approximate a lower-dimensional pdf. 

b Based on the assumption that increasing the number of Gaussian components will improve the approx- 
imation. 


l x fc> u *J 
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Simulation Results 


As an example, consider a nonlinear free fall and parachute model. An 80-kg paratrooper 
is dropped from an airplane at a height of 1200 m. After 5 seconds, the chute opens. The 
paratrooper’s height as a function of time, y(t), is given by 


y(t) = -g + a(t)/m (42) 

?/(0) = 1200 nr 

?/(0) = 0 m/s 


where g = 9.81 m/s 2 is the acceleration due to gravity and m = 80 kg is the paratrooper’s 
mass. The air resistance a(t ) is proportional to the square of the velocity, with different 
proportionality constants before and after the chute opens: 


a(t ) 


K 1 y(t) 2 , t< 5 s 

K 2 y{t) 2 : t> 5 s 


(43) 


where K\ = 1/15 and K 2 = 4/15. 

A Monte Carlo simulation is run using 10,000 different initial conditions for a time du- 
ration of 15 seconds, with a At = 0.1 seconds. The truth values are computed by taking 
the average of the Monte Carlo (MC) data points at each time step for the nonlinear model 
propagation. The simulations for the RB-AGS filter are run 1,000 times using random initial- 
izations for the Gaussian components used. Next the average is computed for both methods 
and used as the overall MC estimated state values and covariances. Lastly, the errors are 
given by the difference between the MC truth and MC estimate. 

In Figures 1(a) and 1(b), the evolution of the pdf of the RB-AGS is shown which clearly 
indicates the non-Gaussian nature of the problem. In Figure 2 the propagated estimate 
from the two filters are plotted for the last few iterations. The top plot shows the position 
estimates, which are both close to the MC solution at each time step. However, the bottom 
plot reveals the RB-AGS estimate follows the MC values closer than the standard AGS as 
expected. The resulting estimates only differ by approximately a few tenths of a meter per 
second between each other, but still obtains a positive improvement in accuracy. 


11 of 14 


-8.5 


PDF of Rao-Blackwellized filter at 1 s 


PDF of Rao-Blackwellized filter at 9 s 


-51.6 




1193 1193.5 1194 1194.5 1195 1195.5 1196 1196.5 

Position (m) 


878.8 879 879.2 879.4 879.6 879.6 

Position (m) 


(a) PDF of RB-AGS Filter After 1 Second (b) PDF of RB-AGS Filter After 9 Seconds 

Figure 1. PDF Evolution of RB-AGS Filter 




Figure 2. Comparison of the MC Runs and the Propagation Estimates 


12 of 14 



Conclusions 


A marginalized adaptive Gaussian sum filter was presented here. With this new method 
the linear portion of the state-space model is propagated using the linear Kalman filter and 
the nonlinear portion using the Unscented Kalman Filter for each Gaussian component. 
Reducing the linear portion of the state-space model to linear propagation equations places 
the computational efforts on the nonlinear equations. Simulation results involving parachute 
model indicates that more accurate results are obtained using the marginalized approach 
versus the non-marginalized approach. The computational burden using the marginalized 
approach may be higher or lower than the non-marginalized approach, which is problem 
dependent. 
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